Liu Bo-Yang, Zhao Xin, Dai Hong-Yi, Zhang Ming, Liao Ying, Guo Xiao-Feng, Gao Wei. The impact of honesty and trickery on a Bayesian quantum prisoners’ dilemma game. Chinese Physics B, 2020, 29(7): 070201
The impact of honesty and trickery on a Bayesian quantum prisoners’ dilemma game
Liu Bo-Yang1, †, Zhao Xin2, Dai Hong-Yi3, 4, Zhang Ming2, Liao Ying1, Guo Xiao-Feng1, Gao Wei1
Information Engineering University, Zhengzhou 450001, China
College of Artificial Intelligence, National University of Defense Technology, Changsha 410073, China
Department of Physics, College of Science, National University of Defense Technology, Changsha 410073, China
Interdisciplinary Center for Quantum Information, National University of Defense Technology, Changsha 410073, China
Project supported by the National Natural Science Foundation of China (Grant Nos. 61773399, 61673389, and 61273202) and the Special Funded Project of China Postdoctoral Science Foundation (Grant No. 2017T100792).
Abstract
To explore the influence of quantum information on the common social problem of honesty and trickery, we propose a Bayesian model for the quantum prisoners’ dilemma game. In this model, the players’ strategy formation is regarded as a negotiation of their move contract based on their types of decision policies, honesty or trickery. Although the implementation of quantum information cannot eliminate tricky players, players in our model can always end up with higher payoffs than in the classical game. For a good proportion of the credibility parameter range, a rational player will take an honest action, which is in remarkable contrast to the observation that players tend to defect in the classical prisoners’ dilemma game. This research suggests that honesty will be promoted to enhance cooperation with the assistance of quantum information resources.
Cooperation among players is one of the central problems in game theory.[1,2] A variety of concrete game models examine whether a Nash equilibrium solution is Pareto optimal,[3–6] and the critical research question is whether mutually beneficial cooperation can be spontaneously formed and maintained among players. In classical non-cooperative games, both strategies and moves are decided individually, and players’ rational strategies often end up with payoffs lower than what they could get through cooperation. In quantum games, by contrast, strategies and moves can be correlated even if players conduct no classical communication and make only local operations. The entanglement and interference of quantum information make players’ decision-making a quantum process.[7,8] With the help of quantum correlation, players in quantum games can negotiate their moves for cooperation by using quantum information. This property makes cooperative moves possible in a non-cooperative quantum game, thereby helping players to resolve their classical dilemmas.[9–12]
From the viewpoint of players’ cooperation, we can generally divide a quantum game into two stages: strategy negotiation and move decision. At the stage of strategy negotiation, players conduct strategy operations on their local part of the entangled qubits and together develop a quantum state as their move contract. At the stage of move decision, the state of this quantum contract is measured in certain bases to define players’ moves according to their decision policies. Previous studies on quantum games generally focus on one of these two stages. For instance, the Eisert–Lewenstein–Wilkens quantum game scheme[9,13–16] concentrates on the quantization of the strategy negotiation process, where strategy operations are expanded from classical bit operators to unitary quantum gates, and the measurement basis for move decision is assumed to be constant. In contrast, players adjust the measurement basis in Marinatto’s scheme[17–21] and Bayesian quantum games[22–28] to optimize their move decision policies. Entanglement improves players’ payoffs and quantum uncertainty keeps the fairness of the game.[29] In these schemes, however, the stage of strategy negotiation is omitted and players’ move contract is restricted to a given quantum state.
Based on our generalized two-stage quantum game model, we investigate the influence of the tricky action on cooperation in this article. In Eisert’s model, a quantum cooperation contract is achieved by players under the assumption that they all honestly take the established decision policy. In Marinatto’s model and Bayesian games, players are considered to follow the given move contract in determining their moves. Many new issues may arise in quantum games where both strategies and decision policies can simultaneously be determined by players. Here we follow Eisert’s model and concentrate on a new quantum prisoners’ dilemma game, where players are not supervised. In this game, players may not faithfully follow the established decision policies, but may take the tricky action of unilaterally changing decision policies. Such tricky action can be expected to lead to a player’s inconsistent moves and, hence, the collapse of existing Nash equilibria. From this perspective, we will carefully examine the Nash equilibria and the corresponding strategies for a cooperative move contract in quantum games with tricky players.
In the following, we will view our quantum prisoners’ dilemma model as a Bayesian game in order to clarify the difference between a player’s strategy operations and the modification of decision policies. We consider that players with different decision policies are of different types, honest and tricky. A player has only incomplete information about the type of his/her opponent. Following previous studies on the quantum prisoners’ dilemma, we take the standard orthonormal basis as players’ default move decision policy, and regard the change of decision policy as a tricky action. By introducing a credibility parameter to describe the probability that a player is honest or tricky, we study the influence of the tricky action on the Nash equilibria and the cooperation between players, and derive mixed strategies for Nash equilibria under typical quantum strategy operations. In addition, we discuss the positive effect of quantum entanglement and coherence on classical social cooperation.
The rest of our manuscript is organized as follows. In Section 2, we give a detailed introduction to our incomplete information quantum game. The mixed strategy Nash equilibrium for this game is studied in Section 3. By comparing the payoffs of different types of players in Section 4, we show that players in the quantum game are more likely to be honest. Finally, concluding remarks are drawn in Section 5.
2. A Bayesian quantum prisoners’ dilemma game
The process of a typical two-player (Alice and Bob) 2 × 2 quantum game[9,17,23] can be characterized by Fig. 1(a). By conducting strategy operations UA and UB together with J and J† on qubits |ψ〉AB, players develop their move contract |ψ〉S = J†(UA ⊗ UB)J|ψ〉AB. This state is then mapped to actual moves by projective measurements. Generally, the default measurement basis is { |0〉, |1〉 } and it can be adjusted by unitary operations VA, VB according to the players’ decision policies. Among previous quantization schemes, Eisert’s model[9] studies the case in which players autonomously decide their own strategies UA, UB, apply the default decision policy VA = VB = I, and take strategy operations on the initial state |ψ〉AB = |0〉A ⊗ |0〉B. On the other hand, Marinatto’s quantization scheme[17] and Bayesian quantum games[22,23] focus on the adjustment of decision policies VA, VB, but omit the process of strategy negotiation: players’ move contract is provided by the Einstein–Podolsky–Rosen (EPR) state |ψ〉AB = |ψ〉EPR as the decision advice. As shown in Fig. 1(b), the classical game model can be regarded as a special case of the quantum game, which requires no entanglement between players, |ψ〉AB = |ψ〉A ⊗ |ψ〉B and J = J† = I. Thus UA and UB, together with VA and VB, can be expressed as uniform strategy operators Wi ≡ UiVi, i = A, B. For classical operations Ui and Vi, strategy Wi is either the unit operator I or the Pauli x operator σx.
Fig. 1. (a) The sketch of a two-player 2 × 2 quantum game. Here J and J† are two-qubit entangling gates. UA and UB are strategy operators carried out by players Alice and Bob, respectively. MA and MB are projective measurements in bases { |0〉, |1〉 }. Through unitary operations VA and VB, players can adjust their measurement basis according to their decision policies. (b) When |ψ〉AB = |ψ〉A ⊗ |ψ〉B and J = J† = I, a two-player 2 × 2 quantum game is reduced to a classical game.
The implementation of quantum information can help players improve their payoffs in games. As a famous example, the prisoners’ dilemma is resolved when quantum strategies are available.[9] Table 1 presents the payoff matrix of this game. The game attains a Pareto optimal solution when both players employ strategies WA = WB = I, i.e., take the move contract |ψ〉S = |0〉A ⊗ |0〉B. This cooperative move contract, however, cannot be settled by rational players in the classical prisoners’ dilemma game, as a player can always improve his/her payoff by unilaterally changing the strategy to Wi = iσy. As a result, the Nash equilibrium will end up with players taking the move contract |ψ〉S = |1〉A ⊗ |1〉B, although the corresponding payoffs for both players are lower than those in the cooperative solution. In contrast, it was discovered by Eisert et al.[9] that players can avoid the dilemma in the quantum game by employing the quantum strategy Ui = SQ = iσz. The contract |ψ〉S = – |0〉A ⊗ |0〉B leads to a Pareto optimal Nash equilibrium, with which both players will make move C.
Table 1. Payoff matrix of the classical prisoners’ dilemma game. C (cooperation) and D (defect) are the moves that players Alice and Bob can make in the game.[9] Rows give Alice’s move, columns give Bob’s move, and each entry lists the payoffs ($A, $B).

Alice \ Bob    C       D
C              3, 3    0, 5
D              5, 0    1, 1
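The claim that mutual defection is the only classical equilibrium can be checked directly from Table 1. The following is a minimal sketch, not part of the original paper: the names `PAYOFF` and `pure_nash_equilibria` are our own, and the dictionary is keyed by (Alice's move, Bob's move) with payoffs ordered (Alice, Bob).

```python
from itertools import product

# Payoffs (Alice, Bob) from Table 1, indexed by (Alice's move, Bob's move).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")


def pure_nash_equilibria(payoff=PAYOFF, moves=MOVES):
    """Return every move pair from which neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(moves, repeat=2):
        # Alice compares her payoff with every alternative move, Bob's move fixed.
        alice_ok = all(payoff[(a, b)][0] >= payoff[(a2, b)][0] for a2 in moves)
        # Bob does the same with Alice's move fixed.
        bob_ok = all(payoff[(a, b)][1] >= payoff[(a, b2)][1] for b2 in moves)
        if alice_ok and bob_ok:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria())
```

The only pair that survives both checks is (D, D), with payoffs (1, 1), even though (C, C) would give both players 3.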
In previous studies, quantum contracts are established on the assumption that both players take the default decision policy VA = VB = I. If the decision policy can be modified, players may break their cooperative contract again. As a typical case, we consider that player i can deceive his/her opponent by changing the decision policy from Vi = I to Vi = σx, and show the corresponding moves and payoffs in Table 2. For the quantum contract |ψ〉S = |0〉A ⊗ |0〉B, the tricky player will finally take move D and the honest player will select move C. The payoff of the tricky player would thus be improved. Nevertheless, if both players take the tricky action, they will eventually make move D in the game, leading to the decrease of payoffs as in the classical game. Consequently, players will now face new dilemmas about both their strategies and decision policies.
Table 2. Moves and corresponding payoffs of different types of players {(mA,mB),($A,$B)}. Players’ moves are decided under move contract |ψ〉S = |0〉A ⊗ |0〉B. The decision policy of a player is either PH = I (honest) or PT = σx (tricky). Rows give Alice’s decision policy and columns give Bob’s.

Alice \ Bob    PH                    PT
PH             {(C, C), (3, 3)}      {(C, D), (0, 5)}
PT             {(D, C), (5, 0)}      {(D, D), (1, 1)}
To address the aforementioned problems, we extend the quantum prisoners’ dilemma game[9] to a Bayesian version. In this model, the players determine both their strategies and decision policies. Our investigation focuses on the unentangled initial state |ψ〉AB = |0〉A ⊗ |0〉B and the maximally entangling operations J and J†. We consider that only three typical strategies, SC = I, SD = iσy and SQ = iσz, and two decision policies, PH = I and PT = σx, are available to players.
Table 3 reflects players’ payoffs corresponding to certain quantum strategies and decision policies. If the default decision policy is kept, Vi = PH = I, the player is deemed honest. Otherwise, if Vi = PT = σx, the player changes his/her decision policy and takes a tricky action. It is assumed that a player does not have definite information about the type of his/her opponent. Instead, a social credibility p is provided to the players as common knowledge, so that the probabilities of a player being honest and tricky are p and 1 – p, respectively. In the following, we study the mixed strategy Nash equilibrium of this quantum Bayesian game under different credibility values.
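As a small worked illustration of how the credibility p enters, consider the fixed contract |ψ〉S = |0〉A ⊗ |0〉B of Table 2 and a player whose opponent is honest with probability p. The symbols \$_H and \$_T for the expected payoffs of an honest and a tricky player are our own shorthand, and this is only an expectation over the opponent's type for one given contract, not an equilibrium calculation.

```latex
\begin{align}
  \mathbb{E}[\$_H] &= p \cdot 3 + (1-p) \cdot 0 = 3p, \\
  \mathbb{E}[\$_T] &= p \cdot 5 + (1-p) \cdot 1 = 1 + 4p, \\
  \mathbb{E}[\$_T] - \mathbb{E}[\$_H] &= 1 + p > 0 .
\end{align}
```

Under this particular contract the tricky action always pays, and its advantage 1 + p shrinks as the credibility p decreases.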
Table 3. Payoff matrix of the maximally entangled quantum Bayesian prisoners’ dilemma game. The quantum strategy operators available to players are SC = I, SD = iσy and SQ = iσz. In this game, players are either honest or tricky. PH and PT correspond to player honestly keeping the default decision policy or taking the tricky action, respectively.

                  Bob: PH                   Bob: PT
                  SC      SD      SQ        SC      SD      SQ
Alice: PH   SC    3, 3    0, 5    1, 1      0, 5    3, 3    5, 0
            SD    5, 0    1, 1    0, 5      1, 1    5, 0    3, 3
            SQ    1, 1    5, 0    3, 3      5, 0    1, 1    0, 5
Alice: PT   SC    5, 0    1, 1    0, 5      1, 1    5, 0    3, 3
            SD    3, 3    0, 5    1, 1      0, 5    3, 3    5, 0
            SQ    0, 5    3, 3    5, 0      3, 3    0, 5    1, 1
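The entries of Table 3 can be reproduced numerically from the ingredients stated in the text: the initial state |0〉A ⊗ |0〉B, the gate J = exp(– iπσy ⊗ σy / 4), the strategies SC, SD, SQ and the decision policies PH, PT. The sketch below is our own illustration using NumPy. It assumes Alice's qubit comes first in the tensor product and that a decision policy V is applied to the contract just before measurement in { |0〉, |1〉 } (for V = σx the choice between V and V† makes no difference). The function name `expected_payoffs` is hypothetical.

```python
import numpy as np

# Pauli matrices and the identity.
I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

# Strategy operators and decision policies as named in the text.
STRATEGIES = {"SC": I2, "SD": 1j * SY, "SQ": 1j * SZ}
POLICIES = {"PH": I2, "PT": SX}

# J = exp(-i*pi/4 * sy(x)sy). Because (sy(x)sy)^2 = I, the exponential
# reduces to cos(pi/4)*I - i*sin(pi/4)*sy(x)sy.
J = (np.eye(4) - 1j * np.kron(SY, SY)) / np.sqrt(2)

# Payoffs (Alice, Bob) for outcomes |00>, |01>, |10>, |11>, with 0 = C, 1 = D.
PAYOFFS = np.array([[3, 3], [0, 5], [5, 0], [1, 1]], dtype=float)


def expected_payoffs(sa, sb, pa="PH", pb="PH"):
    """Expected (Alice, Bob) payoffs for strategies sa, sb and policies pa, pb."""
    psi0 = np.zeros(4, dtype=complex)
    psi0[0] = 1.0  # unentangled initial state |0>_A |0>_B
    # Move contract J^dagger (U_A (x) U_B) J |00>.
    contract = J.conj().T @ np.kron(STRATEGIES[sa], STRATEGIES[sb]) @ J @ psi0
    # Decision policies adjust the measurement basis (assumption: apply V, then measure).
    measured = np.kron(POLICIES[pa], POLICIES[pb]) @ contract
    probs = np.abs(measured) ** 2
    return probs @ PAYOFFS


# Print the row "Alice: PH, SQ" of Table 3.
for pb in ("PH", "PT"):
    for sb in ("SC", "SD", "SQ"):
        print(pb, sb, np.round(expected_payoffs("SQ", sb, "PH", pb), 6))
```

With these three operators every contract turns out to be a single computational basis state up to a phase, which is why all entries of Table 3 are entries of Table 1.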
3. Mixed strategy Nash equilibrium of quantum Bayesian prisoners’ dilemma game
In this section, we derive the Nash equilibrium of the quantum Bayesian prisoners’ dilemma game. The quantum cooperation contract and corresponding payoffs are analyzed under different social credibility p. According to Table 3, no single pure strategy Nash equilibrium with operations Ui = Sm, m = C, D or Q, holds for all credibility values p ∈ [0,1]. Consequently, we shall consider the contract of two players’ mixed strategies in this section. For convenience of discussion, we express such a mixed strategy in the form of a density matrix rather than a simple state vector in the following.
In the following discussion, we denote the probabilities of player i taking operators Sm by (m = C, D or Q, and ). Here i = A, B corresponds to players Alice and Bob, respectively. ki = H stands for player i being honest, and ki = T for player i being tricky. Additionally, we will express the mixed strategy of each player by parameters sA and sB, where represents the probability that player i takes each of the corresponding strategy operators. With strategy parameters sA and sB, players’ move contract can be described by the following density matrix,
Here is the quantum move contract when Alice takes the operation Sm and Bob takes the operation Sn, and is the probability players take such an operation. According to the game process given in Section 2, can be calculated as follows:
In our discussion, we consider the unentangled initial state |ψ〉AB = |0〉A ⊗ |0〉B and the maximally entangling operation J = exp(– iπσy ⊗ σy / 4).[9] By setting the state |0〉i = [1,0]T and |1〉i = [0,1]T, we can express for different strategy cases,
Hence ρT in Eq. (1) would be a diagonal matrix of four parameters
where
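The diagonal form of the mixed contract can be illustrated numerically. The sketch below is our own and makes simplifying assumptions that may differ from the paper's Eq. (1): each player mixes independently over SC, SD, SQ, player types are ignored, and the names `contract_state` and `mixed_contract` as well as the example probabilities are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
STRATEGIES = {"SC": I2, "SD": 1j * SY, "SQ": 1j * SZ}

# J = exp(-i*pi/4 * sy(x)sy), written in closed form since (sy(x)sy)^2 = I.
J = (np.eye(4) - 1j * np.kron(SY, SY)) / np.sqrt(2)


def contract_state(m, n):
    """Pure contract J^dagger (S_m (x) S_n) J |00> for Alice's S_m and Bob's S_n."""
    psi0 = np.zeros(4, dtype=complex)
    psi0[0] = 1.0
    return J.conj().T @ np.kron(STRATEGIES[m], STRATEGIES[n]) @ J @ psi0


def mixed_contract(q_alice, q_bob):
    """Density matrix when the players draw operators independently (assumption)."""
    rho = np.zeros((4, 4), dtype=complex)
    for m, pm in q_alice.items():
        for n, pn in q_bob.items():
            psi = contract_state(m, n)
            rho += pm * pn * np.outer(psi, psi.conj())
    return rho


# Hypothetical mixing probabilities, chosen only for illustration.
rho = mixed_contract(
    {"SC": 0.5, "SD": 0.25, "SQ": 0.25},
    {"SC": 0.2, "SD": 0.3, "SQ": 0.5},
)
print(np.round(rho.real, 6))
```

Each pure contract is a computational basis state up to a phase, so every mixture is diagonal and is fixed by four numbers: the probabilities of the outcomes |00〉, |01〉, |10〉 and |11〉.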
With the payoff matrix in Table 3, the expected payoff of player i under strategy ρT can be given as
where is a function of the strategy of the opponent player j,
We denote the Nash equilibrium mixed strategy of player i = A, B as
To ensure that no unilateral strategy modification can be taken to improve player i’s expected payoff, the following condition should be satisfied
In the following, we study the Nash equilibrium condition that the strategy parameters should satisfy under different credibility values p. Our discussion is divided into three cases, each of which corresponds to a specific set of parameters , , and , respectively.
Players i = A, B can arrive at the Nash equilibrium according to Scheme 2a when credibility p ∈ [0, 1/4). With this scheme, an honest player will end up with an expected payoff , while a tricky player can expect a payoff of . When credibility p ∈ (3/4, 23/25], the players can get to the Nash equilibrium with the strategy in Scheme 2b. The expected payoffs are and for honest and tricky players, respectively.
When p ∈ [5/7,1], an honest player can get an expected payoff with strategy Scheme 3a, and a tricky player can get . For p ∈ [5/7,23/25), the application of Scheme 3b will bring an expected payoff to an honest player, and to a tricky player.
In the aforementioned discussions, we have obtained mixed quantum strategies for Nash equilibrium under different credibility p. It should be noted that there is more than one set of Nash equilibrium strategies in some regions of p. In these cases, rational players would prefer the strategies that maximize the sum of the expected payoffs of honest and tricky players. For example, when p is located in [5/7,3/4], there are three possible Nash equilibria, which are given in Scheme 1, Scheme 3a, and Scheme 3b, respectively. Comparing Scheme 3a with Scheme 3b, we find that tricky players have the same strategy in the two cases, while honest players have a better payoff in the equilibrium reached in Scheme 3a. So players will not reach their Nash equilibrium according to the strategy given in Scheme 3b. On the other hand, if a player develops the strategy according to Scheme 1, the opponent will gain the same payoff whether his/her strategy follows Scheme 1 or Scheme 3a. But if a player develops the strategy according to Scheme 3a, the opponent will gain a higher payoff when he/she also develops the strategy according to Scheme 3a. As a result, only Scheme 3a is a possible Nash equilibrium. When p is located in [5/7,3/4], players will therefore adopt the strategy of Scheme 3a. In that case, honest and tricky players have different payoffs. Similarly, we can analyze the case p in [3/4,1]. Nash equilibrium strategies under different parameter p are summarized in Table 4.
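The equilibrium condition can be made concrete with a small checker built on Table 3. This is a sketch under our own assumptions rather than the paper's derivation: it considers only symmetric profiles in which both players use the same type-contingent pure strategy, the opponent is honest with probability p, and the names `HONEST_MOVES`, `expected_payoff` and `is_bayesian_nash` are hypothetical. Checking pure deviations is enough because the expected payoff is linear in the deviating player's mixing probabilities. The profile tested, SQ for both types, is our choice for illustration.

```python
from fractions import Fraction

# Final moves (own, opponent's) for each pure strategy pair when both players
# are honest, read off the PH/PH block of Table 3 (0 = C, 1 = D).
HONEST_MOVES = {
    ("SC", "SC"): (0, 0), ("SC", "SD"): (0, 1), ("SC", "SQ"): (1, 1),
    ("SD", "SC"): (1, 0), ("SD", "SD"): (1, 1), ("SD", "SQ"): (0, 1),
    ("SQ", "SC"): (1, 1), ("SQ", "SD"): (1, 0), ("SQ", "SQ"): (0, 0),
}
# Own payoff for each pair of final moves (own, opponent's), from Table 1.
OWN_PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}
STRATEGIES = ("SC", "SD", "SQ")
TYPES = ("H", "T")


def payoff(my_type, my_s, opp_type, opp_s):
    """Own payoff; a tricky player (type T) flips his/her own final move."""
    a, b = HONEST_MOVES[(my_s, opp_s)]
    if my_type == "T":
        a ^= 1
    if opp_type == "T":
        b ^= 1
    return OWN_PAYOFF[(a, b)]


def expected_payoff(my_type, my_s, profile, p):
    """Expectation over the opponent's type: honest with probability p."""
    return (p * payoff(my_type, my_s, "H", profile["H"])
            + (1 - p) * payoff(my_type, my_s, "T", profile["T"]))


def is_bayesian_nash(profile, p):
    """True if neither type can gain by deviating to another pure strategy."""
    return all(
        expected_payoff(t, profile[t], profile, p) >= expected_payoff(t, s, profile, p)
        for t in TYPES
        for s in STRATEGIES
    )


all_sq = {"H": "SQ", "T": "SQ"}  # illustrative profile, not taken from the paper
for p in (Fraction(1, 2), Fraction(5, 7), Fraction(9, 10)):
    print(p, is_bayesian_nash(all_sq, p),
          expected_payoff("H", "SQ", all_sq, p),
          expected_payoff("T", "SQ", all_sq, p))
```

For this profile the expected payoffs are 3p for an honest player and 1 + 4p for a tricky one, and the honest type's deviation to SC (worth 5 – 4p) stops being profitable exactly when p ≥ 5/7.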
Table 4. The Nash equilibrium mixed strategies and expected payoffs for credibility p ∈ [0,1]. The payoff of an honest player is always higher than that of a tricky player for p ∈ [0,1/4). With p ∈ [1/4,5/7), honest and tricky players will get the same expected payoff. Only when credibility p ∈ [5/7,1] will the payoff of an honest player be exceeded by that of a tricky player. When p < 5/7, a rational player would not benefit from taking a tricky action.
p
i
0
0
1
0
0
1
0
0
1
3p
1 + 4p
1 + (3 – p)p
4. Discussion
As mentioned in Section 2, rational players of the classical prisoners’ dilemma game always break the cooperation contract |ψ〉S = |0〉A ⊗ |0〉B to maximize their individual payoffs. Although the contract |ψ〉S = |1〉A ⊗ |1〉B can be sustained by players, it only leads to lower payoffs. Our study indicates that cooperation between players can be achieved more easily with the assistance of quantum information in the quantum prisoners’ dilemma game. From Table 4 and Fig. 2, we see that although quantum information cannot eliminate tricky actions, it can provide the prisoners’ dilemma game with a negative feedback of credibility. This mechanism reduces the profit from the tricky action as the credibility decreases. In particular, when p ∈ [0,1/4), tricky players get a lower payoff than honest ones. Taking the social credibility parameter p as the probability that a player keeps his/her contract, our quantum game analysis reveals that the credibility can be as high as 5/7. In contrast, the credibility will always be 0 in the classical prisoners’ dilemma game, where no cooperation contract can be sustained by players.
Fig. 2. The expected payoffs of tricky and honest players and their difference under different credibility values p. The difference of payoffs drops with the decrease of credibility for p ∈ [0,1]. When p ∈ [0,1/4), the payoffs of tricky players are lower than those of honest players.
When p ∈ [0,1/4), an honest player will always conduct quantum operation SQ. Thus the quantum prisoners’ dilemma game between two honest players will end up with the Pareto optimal Nash equilibrium, in which each player receives the cooperative payoff of 3 (Table 3). When p ∈ [1/4,5/7), players’ payoffs remain constant at $i = 9/4 regardless of their being honest or tricky. At last, when p ∈ [5/7,1] the expected payoff is at least for an honest player, and for a tricky player. In all these cases, payoffs in quantum games are higher than in the classical prisoners’ dilemma game, $i = 1. As a result, although our quantum game may suffer from tricky actions, it is always advantageous over the classical game in terms of players’ payoffs.
It should be noted that we have made full use of quantum entanglement in this quantum game. The strategy operations implemented in forming move contracts include both the classical strategies SC = I, SD = iσy and the optimal quantum strategy of the quantum prisoners’ dilemma game,[9] SQ = iσz. Although there are still other available operations in a quantum game, the most typical quantum strategy operations are discussed in our study. In addition, these operations are conducted on a maximally entangled quantum state. Hence our cooperation contract ρT is a coherent superposition of different pure strategies, rather than a simple tensor product of each player’s strategies. It is the entanglement and coherence in players’ strategies that make the quantum contract different from the classical one and provide our game with an advantage over the classical version.
5. Conclusion
In summary, a new quantum Bayesian prisoners’ dilemma game model is proposed to investigate the influence of tricky actions on quantum games. In this model, players not only decide their quantum strategies, but also adjust their decision policies for actual moves. We treat the unilateral change of the default decision policy as a player’s tricky action. It is assumed that a player has no certain information about whether his/her opponent is honest or tricky. Instead, the players share common knowledge of the probability of a player being honest, characterized by a credibility parameter p. A thorough study is carried out to examine the equilibrium of this incomplete information game under typical quantum strategy operations. Although quantum games cannot eliminate tricky actions, the two players can always achieve higher payoffs than those in the classical game. For a good proportion of the range of the credibility parameter p, a rational player will take an honest action. This is in contrast to the observation that the players always defect in the classical game. This research suggests that honesty will be promoted to enhance cooperation in social affairs with the assistance of quantum information entanglement and coherence.